arXiv:1504.01563v1 [cs.CY] 7 Apr 2015


"How much?" Is Not Enough 
An Analysis of Open Budget Initiatives 


Alan Tygel
Graduate Program on Informatics - PPGI - UFRJ, Brazil
alantygel@ppgi.ufrj.br

Judie Attard
University of Bonn, Germany
attard@iai.uni-bonn.de

Fabrizio Orlandi
University of Bonn, Germany
orlandi@iai.uni-bonn.de

Maria Luiza Machado Campos
Graduate Program on Informatics - PPGI - UFRJ, Brazil
mluiza@ppgi.ufrj.br

Sören Auer
University of Bonn and Fraunhofer IAIS, Germany
auer@cs.uni-bonn.de


ABSTRACT 

A worldwide movement towards the publication of Open Government Data is taking place, and budget data is one of the key elements pushing this trend. Its importance is mostly related to transparency, but publishing budget data, combined with other actions, can also improve democratic participation, allow comparative analysis of governments and boost data-driven business. However, the lack of standards and common evaluation criteria still hinders the development of appropriate tools and the materialization of the appointed benefits. In this paper, we present a model to analyse government initiatives to publish budget data. We identify the main features of these initiatives with a double objective: (i) to drive a structured analysis, relating some dimensions to their possible impacts, and (ii) to derive characterization attributes to compare initiatives based on each dimension. We define use perspectives and analyse some initiatives using this model. We conclude that, in order to favour use perspectives, special attention must be given to user feedback, semantics standards and linking possibilities.

General Terms 

open government data, open budget initiatives, e-government, participation, transparency

1. INTRODUCTION 

In the last six years, a worldwide movement towards the publication of Open Government Data (OGD) has been taking place. The aims and scope of OGD initiatives in each country are diverse and we can count almost 100 countries publishing some kind of OGD.¹

¹ According to the Open Data Index: http://index.okfn.org/
The motivations for governments to publish OGD are also diverse. They range from the democratic point of view, with increasing government transparency and citizen participation, to the more economic motivation of fostering new data-driven businesses. The strengthening of law enforcement has also fostered OGD publishing [10].

A large number of stakeholders may take part in the OGD ecosystem, namely: as data providers, different levels of public administration (including local, regional, national and transnational) and citizens; and as consumers, civil society initiatives and NGOs, companies, journalists and media organisations. While data providers mostly play the specific role of publishing data in an open format, other stakeholders participate in this initiative in a number of ways, including viewing the open data, sharing feedback, and exporting data into their own systems. It is also expected that these stakeholders behave as prosumers, not only passively consuming data, but also interfering in its production and publication.

OGD can be related to a diversity of themes. Education, crime, health, transportation and company registration are common subjects. However, one type of data is of particular importance: government budgetary data, as timely access to these data is critical to accomplish government accountability.

All governments and public administrations maintain budgetary data, unlike, for example, bus position data, which depends on sensors, or data about the occurrence of a specific disease, which depends on a health information system. From the citizen side, information on budget is a key element to ensure that public funds are being properly used. In locations where a participatory budget was implemented, that is, where part of the budget allocation is decided by the community, access to this kind of data is indispensable. A global initiative to improve openness of governments - the Open Government Partnership (OGP) - has fiscal transparency as a minimum eligibility criterion, characterizing budget data as a foundation of open government.

Even with so many possible positive impacts, existing public financial transparency portals suffer from a number of shortcomings. First of all, they suffer from the large number of diverse data structures that make the comparison and aggregate analysis of transnational financial flows practically impossible. The tools to present, search, download and visualise this financial data are also nearly as diverse as the number of existing portals. This heterogeneity [24] may even prevent an analysis of the quality of the data for the same funds administered by different funding authorities. Past efforts have sought to overcome this situation by creating comprehensive and connected transparency portals, such as Farmsubsidy.org, and more recently, Publicspending.net.

Within the existing open budget initiatives, low user engagement has been reported [28]. Moreover, most of the budget publishing efforts result in simple data catalogues, fragmented and dispersed, because they do not share standards and methodologies [24]. The absence of standards can lead to data misuse [30], or even to results opposed to the initial aims [8].

The basis for such standards has to be set. Together with other ongoing initiatives [15, 26], we believe that the development of a solid standard can help governments to make their budget data more usable, and thus enable citizen participation in the democratic process. In this article we define a structured analysis framework for budget data, which can help developers and policy makers to understand the importance of various aspects of budget data publishing and to develop more adequate budget publishing systems. After defining some foundational concepts (Section 2), highlighting the importance of budget data (Section 3) and discussing related work (Section 4), we describe the chosen methodology (Section 5) and derive dimensions and characterization attributes, based on three use perspectives (Section 6). These characterization attributes are applied to 23 open budget initiatives (Section 7), and results are discussed (Section 8).

2. WHAT IS BUDGET DATA? 

Open Budget Data is the topic of a few recent publications [26, 23, 14, 16, 4]. Nevertheless, it is important to establish a common ground for some basic concepts, as they do not always have a single, widely accepted definition. Here, we propose definitions for Budget, Spending and Revenue, as the main quantities tackled, and Open Budget Data as the general term.³

DEFINITION 1. Budget is the description of the amount of money planned to be spent in a specified time period. Budget descriptions can refer to several levels of specificity, from general (total amount to be spent) to specific (amount by area, or category). A budget description can be characterized by:

• (i) the scope, that is, the corresponding administrative level (municipality, region, country etc.);

• (ii) optionally, a domain, such as healthcare or public transportation;

• (iii) if applicable, the related location (region, city, neighbourhood, or latitude and longitude); and

• (iv) a period of time.

A budget comprises a set of budget items, each of which has a budget category and an associated amount with a currency. Categories can be organized hierarchically, where higher levels of the hierarchy represent aggregations of the lower levels. There are different types of budget, such as proposed, planned, and certified, the last being presented after the budget term. Budgets may also receive amendments during their associated term.
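Definition 1 can be read as a small data model. The following Python sketch is one possible reading, not a standard: the class names `BudgetItem` and `Budget`, the field names, and the sample figures are all hypothetical. It shows how a hierarchy of categories lets higher levels aggregate the amounts of lower levels.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BudgetItem:
    """A budget item: a category with an amount in a given currency."""
    category: str
    amount: float = 0.0
    currency: str = "EUR"
    # Lower levels of the category hierarchy (hypothetical field name).
    children: List["BudgetItem"] = field(default_factory=list)

    def total(self) -> float:
        """Own amount plus the aggregated amounts of all lower levels."""
        return self.amount + sum(child.total() for child in self.children)


@dataclass
class Budget:
    scope: str                        # (i) administrative level
    period: str                       # (iv) period of time
    budget_type: str                  # e.g. proposed, planned or certified
    domain: Optional[str] = None      # (ii) optional domain
    location: Optional[str] = None    # (iii) related location, if applicable
    items: List[BudgetItem] = field(default_factory=list)

    def total(self) -> float:
        """Total amount planned across all top-level items."""
        return sum(item.total() for item in self.items)


# Illustrative figures only; they do not come from any real budget.
health = BudgetItem("Health", children=[
    BudgetItem("Hospitals", 700.0),
    BudgetItem("Prevention", 300.0),
])
budget = Budget(scope="municipality", period="2015", budget_type="planned",
                items=[health, BudgetItem("Education", 500.0)])
```

Keeping the aggregation in `total()` rather than storing it means that an amendment to a lower-level item is reflected at every higher level without further bookkeeping.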

³ A further discussion about these terms can be found in http://community.openspending.org/research/handbook/types-of-spending-data/


DEFINITION 2. Spending, or expenditure, refers to the amount of money actually spent by the public administration. It can also be seen as the realisation of the budget. Government spending can be split into four main categories:

• (i) Transfer payments, related to social benefits such as pensions, housing, or floor income for low income households;

• (ii) Current government spending, related to the costs of maintaining the government structure, mainly public employees' salaries;

• (iii) Capital spending, which goes to building infrastructure, such as roads, hospitals, schools etc.; and

• (iv) Financial costs, such as internal and external debt services.

Ideally, spending should be published at the finest grain: transactions, that is, the description of every payment, including value, time period and recipient. Transactions should also be classified according to properly defined criteria to generate aggregate amounts. These criteria are the same as specified for the budget: scope, domain, place and time. There are also different types of spending, such as planned (according to the budget), authorized (payment order) and executed (money transferred from government to the recipient).
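To illustrate how classified transactions yield aggregate amounts, the sketch below groups transaction-level records by one of the classification criteria. It is a minimal example under stated assumptions: the `Transaction` record, its field names, the `aggregate` helper and the sample payments are hypothetical and are not taken from any existing portal.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One payment, classified by the criteria used for the budget."""
    value: float
    period: str
    recipient: str
    scope: str
    domain: str
    place: str
    stage: str  # "planned", "authorized" or "executed"


def aggregate(transactions, key, stage="executed"):
    """Sum transaction values per distinct value of `key` for one stage."""
    totals = defaultdict(float)
    for t in transactions:
        if t.stage == stage:
            totals[getattr(t, key)] += t.value
    return dict(totals)


# Illustrative payments only.
payments = [
    Transaction(100.0, "2014", "Clinic A", "city", "health", "North", "executed"),
    Transaction(50.0, "2014", "School B", "city", "education", "North", "executed"),
    Transaction(25.0, "2014", "Clinic C", "city", "health", "South", "authorized"),
]
by_domain = aggregate(payments, "domain")
by_place = aggregate(payments, "place", stage="authorized")
```

Publishing at transaction level keeps every such aggregation available to the consumer, whereas publishing only aggregates fixes the grouping in advance.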

DEFINITION 3. Revenue is the amount of money received by a government administration. Revenues can have several types of origins, such as taxes (revenue, commercialization), service fees (transportation), royalties (oil and mine exploration), concessions (roads, electromagnetic spectrum) or financial operations. Predicted revenues, used to specify the budget, may differ from the actual revenues.

DEFINITION 4. Open Budget Initiative refers to any portal or application which publishes budget, spending and/or revenue data, and that allows civil society - IT experts or not - to access those data. It may comprise one or many datasets, which can be downloaded in several formats or directly visualized in tables, charts or maps. The model presented in Section 6 describes an Open Budget Initiative in further detail.

3. WHY BUDGET DATA? 

The importance of publishing government budgetary data can be 
summarised in five key elements: 

Transparency: Opening budget data unveils public funds management. This increases accountability and therefore augments citizens' trust in public administration, whilst having the potential to uncover hidden transactions and thus prevent corruption. An important factor which can stimulate corruption is the fact that funding goes through the hands of public officials without further scrutiny. In European Union Member States, this is particularly evident within public procurement, which is prone to corruption owing to deficient control mechanisms [6]. Essentially, such acts are concealed from the public eye. Supporting financial transparency enhances accountability within public sectors and, as a result, prevents corruption.

Participation: Opaque regimes may compel citizens to engage against the government. A transparent public administration, on the contrary, can stimulate social participation in community enhancement. Open budget initiatives can not only enable meaningful civil society scrutiny of transnational financial flows, but they can also provide platforms for stakeholders to develop benchmarks that create pressure on public authorities to provide data in a timely, comparable, re-usable and well-structured fashion. These platforms can also involve local citizens in the budget planning and auditing phases, by allowing them to interact with the process, providing opinions and suggestions on setting budget priorities, and providing feedback on the published transactions. A virtuous circle can be created, in which both public officials and civil society will realise the value of data and analysis tools, in a collaborative environment open to contributions and engagement.

Comparative Analysis: Well organized budget data helps researchers and policy makers to compare spending strategies between cities, states and countries, and also among different administration levels. Visualisation, analytics and exploration tools can offer different stakeholders an opportunity to scrutinize and interpret financial data related to a region of interest. They also make it possible to compare allocations and transactions between multiple regions, to visualise detected trends and budget projections and to investigate anomalies and activities that have been flagged as suspicious. A necessary condition for that is the compatibility and consistency of data from different data sources.

Efficiency and Effectiveness: Efficiency of public spending can 
be assessed by comparing, for example, the cost per kilometre of a 
railway. The effectiveness can also be assessed, in this case, by the 
revenues generated with the railway. 

Business Value: It has been recently stated that "Open data can help unlock U$3 trillion to U$5 trillion in economic value annually" [13]. Publishing budget data can stimulate the creation, delivery and use of new services on a variety of devices, utilising new web technologies, coupled with open public data. These services include visualisation services and data discovery services, such as data mining and comparative analysis, which enable stakeholders to explore the data, identify patterns, and potentially forecast budget and transaction trends. Budget data can also generate value by empowering journalists when they report on spending items. Accurate information on public funds usage may enable content producers to create better articles.


4. RELATED WORKS 

A number of recent works proposed frameworks, impact measures or comparison criteria on the general open data domain. Some of them aim at comparing e-government and open data policies [29, 25]. In [22], a framework is proposed to evaluate OGD initiatives, pointing also to the development of impact metrics.

A theoretical background to analyse the impact of OGD was developed in [7]. Impacts are divided into economic, political and social, and for each of them, possible implementation issues and impact metrics are discussed in depth. Recently, a working group was created to develop methods for assessing open data. In their first report [3], a draft of a framework is proposed.

Automatic benchmarking techniques are proposed in [1]. Despite enabling large-scale, low-cost and high-frequency evaluations, automatic assessment can miss some political and social aspects of open data.

Even though structured analysis and comparison of open budget initiatives have not received much attention from the literature, two works must be highlighted. The Open Budget Survey GD is a research project that, every two years, "measures the state of budget transparency, participation, and oversight in countries around the world". It generates the Open Budget Index,⁴ which is updated monthly, and is based on the publication of eight key budget documents. Despite being a very useful comparison tool, this methodology does not evaluate the information systems used to publish budget data, which are how the information reaches society.

An evaluation and comparison between almost 30 Brazilian government transparency portals, on several administration levels, is presented in [21]. The analysis was based on the 8 Open Government Principles, evaluated for each portal by experts. Despite being a well defined and widely accepted model, these principles are quite general, and do not refer to specific characteristics of budget data. Moreover, they basically cover the publisher side.

5. METHODOLOGY 

The research approach used to develop this model was inspired by the observation, induction and deduction method used in [29]. After analysing the related bibliography and observing some randomly collected open budget initiatives, we used inductive reasoning to build the first approach to the model. The model is a set of dimensions, which represent different themes to be assessed in an open budget initiative. Dimensions are grouped in parts, according to their general functions.

The same basis is used to define use perspectives (UP), which 
represent different ways of using budget data. From the UPs, we 
extract related requirements. 

Model and use perspectives were then applied to other open budget initiatives in a deductive reasoning, in order to verify the fitness of the dimensions, and the coverage of the use perspectives. Missing items were added to the model and to the use perspectives, and the feedback loop was run until no significant changes were found. Finally, use perspectives were checked against the model, in order to verify the correspondence between model dimensions and use perspectives. This correspondence is materialized in the characterization attributes.

The result of this observation, induction and deduction approach 
is described in the next section. 

6. A MODEL TO ANALYSE OPEN GOVERNMENT BUDGET DATA PORTALS

The main objective for building this model is the need for a mechanism to assess different strategies for publishing budget data. We do not aim to build rankings, but rather to systematize open budget initiatives in order to assess their fitness to specific use perspectives. A general overview of the proposed model is depicted in Figure 1. The model consists of four parts:

1. Context, referring to external aspects related to the initiative;

2. General Aspects, referring to the overall characterization of the initiative;

3. Data Publishing, referring to aspects specific to the data publishing process; and

4. Data Consumption, referring to aspects specific to the data consumption process.

Naturally, there is a strong coupling between these parts. The way data are published directly affects consumption. By the same reasoning, the feedback generated by users (should) affect data publishing. The context particularly impacts the general aspects, but also influences the other parts.

⁴ http://www.obstracker.org/

Figure 1: Model to analyse open budget initiatives. The four parts - General Aspects, Publishing, Consumption and Context - are interconnected, and composed of several dimensions. Icons made by Flaticon (CC)

The context part represents the environment in which the open budget initiative is involved. It stands for the open data policies and legislation that govern the publication of spending data, and also the government initiatives to promote the use of data, either only by advertising, or more incisively by promoting data literacy. Although we recognize that the context is a key element for the success of an open budget initiative, we will not consider it in the scope of this paper because its complexity would make the first approach to an objective model unfeasible. For the time being, we will focus on the general aspects directly related to the initiative, and on the issues related to publishing and consuming data.

Each part of the model is composed of several dimensions, which will be assessed through Characterization Attributes.

DEFINITION 5. Characterization Attributes are features of open budget initiatives that: (i) are objectively assessable; (ii) expect qualitative values; and (iii) have direct impact on the realisation of use perspectives.

The characterization attributes derived from the dimensions are summarised in Table 1.

Characterizing an open budget initiative is the first step in order to be able to assess quality. The term quality may refer to different concepts. In this work, we define quality as the conformance to requirements, which in our case are those associated with the use perspectives. In other words, we can say that quality is the fitness for use. Thus, we define three use perspectives, from which we extract some requirements:

UP1 - Transparency: Journalists, software developers, NGOs, and grass-roots movements use budget data to audit government and to translate data into more accessible formats for society. For this use case, detailed data (i.e., transaction level), consistent classification levels, and machine readable formats are some important requirements. Discussion and feedback on the provided data are also requirements in this case, for example, for suggesting different priorities for budgeting, or discussing a particular transaction. Both citizens and public administration benefit from this feature, since citizens (or other stakeholders) can show their perspectives and the public administration entity can check the current priorities to see if they need to be amended.

UP2 - Participation: For the last two decades, cities from all over the world have been implementing participatory budgeting (PB) experiences with different systems and procedures. Research shows how developing and promoting PB digital solutions can increase civic engagement up to seven times (U-. In Europe, digital solutions to promote citizen engagement in budget creation include, for example, sending proposals by email, participating in online forums and discussion, subscription to SMS updates and video streaming (TV). Germany presents one of the most advanced digital solutions to engage citizens, as shown in the participatory budgeting portal of the city of Freiburg.⁶ The participation use perspective will be exemplified by the PB case. PB members must have access to accurate and easily understandable budget data. Through this perspective, design, usability, and human readable formats are the most important requirements. Hierarchically aggregated categories also play an important role.

UP3 - Policy Making: If adequately published, budget data can be used to compare the way each government manages public funds. Researchers and policy makers should be able to compare the budgets and spending data between (i) different public administrations (e.g. Cologne vs Munich); or (ii) different periods (e.g. year 2013 vs year 2014), and thus relate spending strategies to political, economic and social outcomes. Comparing spending profiles among governments requires the use of common classifications, vocabularies and ontologies, and the possibility of linking data with other databases, as, for example, multinational enterprises data [24]. In order to enable the integration of the corresponding budget data on the different public administration contexts, a semantic data model for budgets and spending has to be defined. In this case, publishing financial data in a reusable, machine-processable, linked-data format can enable integration and reuse across multiple sources. The use of a standard format also facilitates the comparison of data from different municipalities or regions. More importantly, it allows all the stakeholders involved or interested in budget planning or spending to manipulate data using the same tools and methods, thus supporting financial transparency in public budgeting and spending. This may allow the creation of visualisations and comparative data analyses for the discovery of trends. Stakeholders will therefore be able to view and compare allocated budgets and transactions, and give feedback on each item. This feedback can then be shared through social media and also be directly exploited by governments and public administrations to achieve better budget management. The latter two stakeholders will thus benefit from receiving targeted suggestions, comparative benchmarks and scenarios.

⁶ http://www.beteiligungshaushalt-freiburg.de

In the remainder of this section, we explain each part of the model, by defining its dimensions, explaining their importance, and proposing characterization attributes (summarised in Table 1), in order to assess the fitness to each use perspective. We define user as any of the stakeholders aiming to consume data from an open budget initiative.
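The idea of assessing fitness to a use perspective can be sketched as a check of an initiative's attribute values against the requirements named for UP1, UP2 and UP3. The Python sketch below is only an illustration: the `Initiative` record, the selection of attributes, and in particular the thresholds (two stars for "machine readable", four stars for "linkable", stories or infographics for "human readable") are assumptions made here for the example, not rules stated by the model.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Initiative:
    """A subset of characterization attributes of one initiative."""
    granularity: str           # finest level of detail: "Transaction" or "Aggregate"
    stars: int                 # position on the five stars of open data, 0 to 5
    uses_vocabulary: bool      # is any ontology or vocabulary used?
    access: FrozenSet[str]     # how data are made available
    feedback: FrozenSet[str]   # available feedback channels


def unmet_requirements(ini: Initiative) -> Dict[str, List[str]]:
    """Map each use perspective to the requirements the initiative misses."""
    checks = {
        "UP1": {
            "transaction-level data": ini.granularity == "Transaction",
            "machine readable format": ini.stars >= 2,   # assumed threshold
            "feedback channel": bool(ini.feedback),
        },
        "UP2": {
            # Assumed proxy for human readable presentation.
            "human readable presentation":
                bool(ini.access & {"Stories", "Infographics"}),
        },
        "UP3": {
            "shared vocabulary or ontology": ini.uses_vocabulary,
            "linkable data": ini.stars >= 4,             # assumed threshold
        },
    }
    return {up: [name for name, ok in reqs.items() if not ok]
            for up, reqs in checks.items()}


# A hypothetical portal publishing aggregate CSV files without feedback.
portal = Initiative("Aggregate", 3, False,
                    frozenset({"Catalogue", "Raw Data"}), frozenset())
gaps = unmet_requirements(portal)
```

Returning the unmet requirements per perspective, rather than a single score, is in line with the stated aim of characterizing initiatives instead of ranking them.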

6.1 General Aspects 

6.1.1 Objective 

Motivations to publish budget data, or open data in general, can be very diverse. In Section 3 we listed five common reasons for publishing budget data: transparency, participation, comparative analysis, efficiency and effectiveness assessment, and generating business value. Defining the intended audience is also important, since different user profiles require different approaches. For example, in UP1 detailed data in machine readable formats is desirable, while for UP2 human readable charts and tables are most suitable. A SPARQL endpoint could better fit the needs of UP3.

DEFINITION 6. The Objective dimension represents the moti¬ 
vations alleged for publishing budget data, including the definition 
of the intended audience. 

Characterization attributes: We define two characterization attributes: (i) whether an initiative states clearly its objective (CA1), and (ii) whether the intended audience is explicitly defined (CA2).

6.1.2 Content 

Open budget initiatives are very heterogeneous regarding the content they present. Data can refer to several administration levels (local, regional, national), and also to the different branches of power (Executive, Legislative or Judiciary), according to the political system of each country.

DEFINITION 7. The Content dimension has the objective of as¬ 
sessing the nature of the information contained in an open budget 
initiative. 

Characterization attributes: The first important distinction we want to highlight is whether the initiative is exclusively for publishing budget data, or whether it contains other kinds of information (CA3). Then, we also distinguish primary sources of data from applications working over data published by other initiatives, that is, secondary data (CA4). Finally, we assess the scope of the initiative (CA5), classifying it into local (1), regional (2), national (3) or transnational (4) range. A special sign identifies initiatives focused only on the legislative power (L), considering that initiatives normally exhibit general budget data. We also consider that the scope can be generic (5), when the initiative allows publishers to display different datasets, referring to different scopes.

6.1.3 Responsibility 

Publishing budgetary data implies a great responsibility for the people in charge of the initiative. This kind of information is quite sensitive, and mistakes can lead to severe consequences. Government, as supplier of primary data, may define specific sectors to be responsible for publishing budget data. In the US, responsibility is under the General Services Administration, while in the UK, there is a Transparency and Open Data team under the Cabinet Office. In Brazil, administration is under the Ministry of Planning, Budget and Management. Civil society organizations also play an important role by building applications over primary data, especially regarding UP2. In this case, responsibility lies in making the context clear, and simplifying as much as possible for data to be understood, but as little as possible to avoid misinterpretations.

DEFINITION 8. The Responsibility dimension of an open bud¬ 
get initiative refers to the person(s) or organization(s) responsible 
for publishing the data, from operational tasks up to guaranteeing 
the authenticity of the provided information. 

Characterization attributes: We define, as a characterization attribute, the distinction between data provided by governments and by society (CA6). We also consider the possibility of a joint government/society partnership.

6.2 Publishing 

Actions to be taken referring to these dimensions are expected 
from data publishers, supposedly influenced by data consumers. 

6.2.1 Data 

While the Content dimension (Section 6.1.2) aimed to deal with general aspects related to the content of an open budget initiative, the Data dimension focuses on specific aspects.

DEFINITION 9. The Data dimension represents specific aspects of the data content and determines what kind of information can be extracted from an open budget initiative.

Characterization attributes: In order to characterize the data content, we define three characterization attributes: (i) Measures, i.e., the types of represented quantities, which can be budget, spending and/or revenues (CA7); (ii) Dimensions, i.e., how the measures are qualified, which can be time, space and/or other categories (CA8); and (iii) Granularity, i.e., the finest level of detail available: transaction or aggregate (CA9). For all CAs in this dimension, we also accept the generic value, when the options are not predefined and several datasets in the same initiative present different settings.

6.2.2 Formats 

When data are offered for download, the format in which they 
are encoded plays a very important role. For Ul|T| data in ma¬ 
chine readable formats are crucial. For Ul{3] unique identification 
of entities and relations is also very important. The semantic re¬ 
sources generated by open budget initiatives can be instantly ready 
for reuse, when resources follow Linked Open Data (LOD) prin¬ 
ciples and guidelines j9j. In this case, all URIs must be resolv¬ 
able and dereferenceable. These resources shall be accessible via 
a S PARQ iQ endp oin t, or by directly resolving resource URIs. The 

7 SR4RQL is a set of specifications to query and manipulate RDF 





Table 1: Model parts, dimensions and characterization attributes defined to characterize an Open Budget Initiative. 


Model Part 

Dimension 

Characterization Attribute 

Possible Values 


Objective 

CA1: Is the objective clearly stated? 

Yes/No 


CA2: Is the intended audience defined? 

Yes/No 


General 


CA3: Are data exclusively on budget? 

Yes/No 

Content 

CA4: What is the source of data? 

Primary Source/Secondary Source 



C ountry/Regional/Loc al 

CA5: What is the scope covered by the strategy? 

Transnational/Generic, and Legislative 



Responsibility 

CA6: Who is responsible for the strategy? 

Govemment/Society/Both 



CA7: What measures are available? 

Budget/Spending/Revenues/Generic 


Data 

CA8: What dimensions are available? 

Time/Place/Payer/Payee/Category/Generic 



CA9: What is the finest data granularity? 

Transaction/Aggregate/Generic 

Publishing 

Formats 

CA10: Which formats are available? 

Five Stars of Open Data 

Metadata 

CA11: Are metadata available? 

Yes/No 



Semantics 

CA12: Is any ontology or vocabulary used? 

Yes/No 


Access 

CA13: How are data made available? 

Catalogue/Raw Data/Querying 




System/Stories/Infographics 


License 

CA14: Are data licensed? 

Yes/No 

Consumption 

Usability 

CA15: What software tool is used? 

CKAN/OpenSpending/Other 

Feedback 

CA16: Is it possible to give feedback over data? 

Comments/Data Request/Issue Reporting 



latter must return either an RDF representation of the resource or
a more eye-friendly HTML visualisation, according to the negotiated
content type. It is of utmost importance that the resources are
available on a stable server, all the more so because these
resources could be linked to others in the LOD cloud. The resulting
data, which will be in a standard interoperable format (RDF), will
be fully compliant with the statement of best practices given by
the G8 Science Ministers [?]: "Data should be easily discoverable,
accessible, assessable, intelligible, useable, and wherever
possible interoperable to specific quality standards". Because LOD
is a widespread initiative, existing tools can be exploited in
order to reuse datasets.
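
As an illustration only, the negotiation step described above can be sketched in Python. The sketch assumes a server that offers exactly two representations (Turtle RDF and HTML) and falls back to HTML; the function names `parse_accept` and `negotiate` and the list of supported media types are hypothetical and are not taken from any initiative discussed in this paper.

```python
# Hypothetical sketch: pick a representation for a resource URI from the
# HTTP Accept header. Media types and the HTML default are assumptions.

SUPPORTED = ("text/turtle", "text/html")


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q-value."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header, default="text/html"):
    """Choose the RDF or the HTML representation for a request."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type in ("*/*", "text/*"):
            return default
    return default
```

A Linked Data client sending `Accept: text/turtle` would receive RDF under this sketch, while a browser, whose Accept header typically lists `text/html` first, would receive the HTML view of the same resource.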

DEFINITION 10. The Formats dimension represents the type of 
formats in which downloadable data are offered by an open budget 
initiative. 

Characterization attributes: Here, we adopt the well-established
open data five stars model⁸ as characterization attribute (CA10).
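
To make the attribute concrete, the following Python sketch rates a set of offered formats on the five-star scale. The format-to-star table is a simplification assumed here for illustration: every star level also presupposes an open licence, and the fifth star requires links to other datasets, which the sketch models with a hypothetical flag.

```python
# Hypothetical sketch of CA10: rate the formats offered for download on
# the five-star open data scale. The mapping below is an assumed
# simplification; all levels also presuppose an open licence.

FORMAT_STARS = {
    "pdf": 1,   # available on the Web, but not structured
    "xls": 2,   # structured, proprietary format
    "csv": 3,   # structured, non-proprietary format
    "xml": 3,
    "json": 3,
    "rdf": 4,   # uses URIs to denote things
}


def five_star_rating(formats, links_to_other_data=False):
    """Return the highest star level reached, or None if nothing is rated."""
    levels = [FORMAT_STARS[f.lower()] for f in formats if f.lower() in FORMAT_STARS]
    if not levels:
        return None  # no downloadable data: not applicable
    best = max(levels)
    if best == 4 and links_to_other_data:
        best = 5  # RDF that also links to other datasets
    return best
```

For example, an initiative offering both PDF reports and CSV files would be rated with three stars under this mapping.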

6.2.3 Metadata 

Adequate metadata are fundamental for providing complementary
information about the context in which data are immersed.
Information such as dataset author, publication date and last
update, formats and license usually forms the basic metadata.
Another useful class of metadata is provenance. Provenance metadata
describe the transformations applied to the dataset, and can also
explain the process through which each data item was generated.

graph content on the Web. More on http://www.w3.org/sparql/

8 http://5stardata.info/

DEFINITION 11. The Metadata dimension refers to the availability
of descriptors associated with the provided datasets.

Characterization attributes: As a characterization attribute, we
check for the existence of metadata in an open budget initiative
(CA11).
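
A minimal Python sketch of this check is given below. It assumes that a dataset description is available as a dictionary; the field names are hypothetical and simply mirror the basic metadata listed above.

```python
# Hypothetical sketch of CA11: check whether a dataset description
# carries the basic metadata named in the text. Field names are assumed.

BASIC_FIELDS = ("author", "published", "last_update", "formats", "license")


def missing_metadata(record):
    """Return the basic metadata fields that are absent or empty."""
    return [f for f in BASIC_FIELDS if not record.get(f)]


def has_metadata(record):
    """CA11 as a yes/no answer: true if any basic descriptor is present."""
    return len(missing_metadata(record)) < len(BASIC_FIELDS)
```

The list returned by `missing_metadata` could support a finer-grained assessment than the yes/no attribute used in this paper.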

6.2.4 Semantics 

In order to be correctly interpreted, data must be contextualized
to avoid problems that emerge from terminological ambiguity or lack
of agreement. Without post-hoc unification the data may be
difficult to understand, as their users may need to familiarize
themselves with different terminologies for each dataset. Having a
single data format may solve structural heterogeneity, at the
expense of introducing yet another format bridging the others. A
more complex issue is semantic heterogeneity, which may be
addressed by approaches ranging from simpler solutions based on
vocabularies to more comprehensive ones based on ontologies. It is
therefore fundamental not to multiply the competing approaches for
modelling public budgets and spending data, but rather to build on
previous work, such as [?], and align divergent approaches using
links and semantic relations.

The current repositories of public finance data, such as
OpenSpending.org, serve well as data catalogues, in which each
dataset exists more or less in isolation as a separate black box.
The absence of links and explicit semantics forms a barrier to
automated processing, combining, and joining datasets of distinct
origin. In the context of such tasks, applying linked data and
semantic web technologies offers greater data interoperability.

Perhaps most important, however, are the benefits of linked data
for improving data interpretation. The key to such improvement
comes from the recognition that measures in public budget and
spending data are relative. If there is no way to compare them and
put them into context, it is difficult to make sense of the data.
Putting money into the wider context of what it was spent on helps
to perform meaningful analyses and find comprehensible "stories" in
data. The context may be provided by linked datasets, such as
population statistics. Links to external data can connect public
finance with the LOD cloud, offering many ways to view data given
different contextualising information, such as economic indicators
or demographic statistics. Ultimately, a key goal of the proposed
data model is to enable better comprehension of public finance
data.
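
The point that measures are relative can be illustrated with a small Python sketch that joins spending figures with population statistics. All region names and figures below are invented for illustration and do not come from any analysed initiative.

```python
# Hypothetical sketch: put spending into context by joining it with
# population statistics. All figures are invented for illustration.

spending = {"Region A": 1_200_000.0, "Region B": 900_000.0}  # currency units
population = {"Region A": 400_000, "Region B": 150_000}      # inhabitants


def per_capita(spending, population):
    """Return spending per inhabitant for regions present in both datasets."""
    return {
        region: amount / population[region]
        for region, amount in spending.items()
        if population.get(region)
    }
```

In this invented example, Region A spends more in absolute terms, yet only half as much per inhabitant as Region B, a comparison that the spending dataset alone cannot provide.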

For UI3, following semantic standards is mandatory. Even though
budget data tend to be very heterogeneous, especially between
different countries, some common points can be found, for example
spending categories (Health, Education, Debt Services) or
international companies. Budget ontologies regarding specific
countries have been developed [20, 18], and even an international
effort is under way [?]. Although they do not provide immediate
linking possibilities, standards such as the Special Data
Dissemination Standard [?] help to make data comparable.

DEFINITION 12. The Semantics dimension refers to the support of
any complementary terminological resource that allows a better
understanding of the data domain concepts.

Characterization attributes: We define the Semantics
characterization attribute as a boolean value that indicates the
presence of standardized vocabularies or ontologies (CA12) in the
open budget initiative.

6.2.5 Access 

The simplest way of publishing budget information is by offering
data for download, which can be done in several formats. However,
in UI2, interactive charts, maps or infographics are more useful
than downloadable datasets, even if this might not be considered
open data in the strict sense. Thus, the Access dimension aims to
check the fit between the intended audience and the way data are
offered.

DEFINITION 13. The Access dimension refers to how the initiative
presents budget data to its audience.

Characterization attributes: Data Access is a characterization
attribute (CA13) which can be assigned as:

• Downloadable data, Linked Data/SPARQL endpoint; 

• Data and metadata catalogue; 

• Exploration by Tables; 

• Visualization by Charts, Maps, Comparison; and/or 

• Stories 


6.2.6 Licensing 

Licensing is a fundamental issue for data reuse. In UI[?], some
kinds of use can be hindered by the absence of adequate licensing,
for example, the development of derived applications. Currently,
three types of general licenses for open data are available⁹:
Public Domain Dedication and License (PDDL), Attribution License
(ODC-By), and Open Database License (ODC-ODbL). Some governments
have developed their own open data licenses, for example, Germany¹⁰
and the UK¹¹.

DEFINITION 14. The Licensing dimension assesses the legal 
status of data available in an open budget initiative. 

Characterization attributes: We define a boolean characterization
attribute to describe the existence of a license (CA14) on data
published by an open budget initiative.

6.3 Consumption 

The justifications and characterization attributes identified in
this paper aim at the success of the use perspectives. In this
part, we detail specific issues related to the actions taken by
users when interacting with budget data.

6.3.1 Usability/Design 

A good set of visualisations, which are self-explanatory and easy
to understand, can certainly improve usage of an open budget
initiative. Interactive visualisations and infographics can also
enable a stakeholder to focus on a particular aspect of the data.
In [27], impacts of usability and design issues are discussed. The
experiments showed how improvements in design led to better results
with users.

Several aspects of this dimension overlap with dimensions of the
Publishing part. In particular, different ways of accessing data
(Access dimension) heavily impact usability, and exporting data in
different formats (Formats dimension), such as CSV, XML or RDB, is
also important to encourage the reuse of data. Thus, the way data
are published can enable stakeholders to get the most out of the
open data.

DEFINITION 15. The Usability/Design dimension verifies if the
initiative interface is suited to the requirements of the use
perspective.

Characterization attributes: The complexity of analysing user
interfaces is beyond the scope of this paper. Nevertheless, we
define a characterization attribute related to the software tool
used by the initiative (CA15), understanding that the tool behind
the initiative plays an important role in usability. Possible
values are the two major open source software tools available for
publishing open data: OpenSpending and CKAN.

6.3.2 Feedback 

In order to enable collaboration between the public sector
administration and the other stakeholders, open budget initiatives
have to provide means to discuss and give feedback on the provided
data. This feedback might be provided to the public administrators
either as comments or as a set of recommendations. For example,
NGOs could give feedback on what the budget should focus on in
their practice area. Ideally, this communication process should be
transparent, that is, feedback and recommendations given to public
administrators should be publicly available and any changes
resulting from the feedback should be recorded.

9 See more at: http://opendatacommons.org/licenses/
10 https://www.govdata.de/dl-de/by-1-0
11 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/

The importance of stimulating user engagement in open data
initiatives through feedback and collaboration has been stressed by
the Five Stars of Open Data Engagement model [3]. This model argues
for the necessity of data being demand driven, contextualized, and
collaborative. The conversation around data is also pointed out as
a strategy to engage users. According to this model, data should be
regarded as a common resource, which reinforces the need for
collaboration. The lack of collaboration has been listed by [29] as
one of the main factors hindering the development of open data
policies.

To enable collaboration, tools to allow feedback on budget
allocations and specific expenditure transactions should be
provided to stakeholders. Public administrations must have the
instruments to receive and effectively manage this feedback,
enabling greater degrees of active citizen involvement and
participation.

DEFINITION 16. The Feedback dimension represents the 
user’s capacity to collaborate in data publishing and express 
her/his opinion. 

Characterization attributes: Although this point requires a deeper
analysis, we noticed that many open budget initiatives do not
present any feedback support. Thus, we define one basic binary
characterization attribute, which is the existence of a feedback
mechanism (CA16). We check if it is possible to: (i) comment on
data; (ii) submit a new data request; and (iii) report issues
noticed in data analysis.
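
As a sketch under stated assumptions, the three checks can be recorded per initiative and reduced to the binary attribute as follows. The class and field names are hypothetical and are not part of the model definition.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Feedback mechanisms checked for CA16 (field names are assumed)."""
    comments: bool = False
    data_request: bool = False
    issue_reporting: bool = False

    def available(self):
        """Names of the mechanisms offered, in the order checked above."""
        return [name for name, offered in vars(self).items() if offered]

    def supported(self):
        """The basic binary attribute: is any feedback mechanism present?"""
        return bool(self.available())
```

Keeping the three checks separate preserves the detail (comments, data requests, issue reporting) while still yielding the yes/no value.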

7. ANALYSIS OF OPEN BUDGET DATA INITIATIVES

In this section, we describe the application of the model to a
number of open budget initiatives. The goal of this evaluation is
not to be exhaustive or to achieve statistical significance, but
rather to test the model, to discover its potential and
limitations, and to gain some intuition on the domain. Results are
shown in Table 2 and data can be accessed at
http://bit.ly/IFNThhH.

The 23 initiatives were chosen considering a balance between
primary (11) and secondary (12) sources (CA4). The sample also
contains at least five initiatives strongly related to each use
perspective, and considers initiatives from 6 countries plus the
European Union, presented in five different languages. Some of the
analysed initiatives are listed on the Map of Spending Project¹².

All primary sources are maintained by the government, and most of
the secondary ones are society driven. Among them, two initiatives
were identified as maintained in partnership between government and
society organizations (CA6). Initiatives generally display their
objectives (22, CA1), but only 11 explicitly mention their intended
audience (CA2). Also, most initiatives offer data for download
(18), which favours UI1, and more than half of them (13) make
visualization available, favouring UI2.

Even considering the low number of initiatives evaluated, two
outcomes drew attention, regarding feedback and semantics.
Commenting on data is allowed in only three initiatives, and the
same number (but not the same ones) offer a data request form. No
issue reporting mechanisms were found, revealing a strong absence
of feedback possibilities (CA16).

12 Available at http://community.openspending.org/map-of-spending-projects/

The lack of semantics support (only three offered it, CA12) or
linkable data (again, only three had it, CA10) may also indicate
that UI3 is still far from reality. Ten initiatives use categories
for the datasets, which at least facilitates some form of
comparison.
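
Counts such as those reported above can be derived mechanically once each initiative is recorded as a set of attribute values. The Python sketch below shows one way to do so; the three records are invented for illustration and are not the 23 initiatives of Table 2.

```python
from collections import Counter

# Hypothetical records: three invented initiatives described by a few
# characterization attributes. These are NOT the initiatives of Table 2.
initiatives = [
    {"name": "A", "CA4": "primary", "CA12": False, "CA14": True},
    {"name": "B", "CA4": "secondary", "CA12": True, "CA14": True},
    {"name": "C", "CA4": "primary", "CA12": False, "CA14": False},
]


def tally(records, attribute):
    """Count how many initiatives take each value of one attribute."""
    return Counter(r[attribute] for r in records)
```

Calling `tally(initiatives, "CA12")` on these invented records would show that one of the three offers semantics support.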

Regarding the use perspectives, we can state: 

UI1 - Transparency: The main requirements for this use perspective
- data on transaction level, machine readable formats, aggregation
levels - were met by most of the open budget initiatives. However,
much work is still to be done concerning feedback handling. We can
say that, for most of the analysed cases, stakeholders interested
in auditing government and in translating data into more accessible
formats are partially satisfied.

UI2 - Participation: The requirements set for this use perspective
called for human readable formats that allow citizens without deep
budget knowledge to understand data and to participate in
discussions. Slightly more than half of the initiatives present
graphics, which can help quick insights into data. Only three
initiatives offer maps to visualize budget data, which is
consistent with the low number of initiatives that include the
location dimension (eight). Another aspect emphasized in this use
perspective was usability and design. Considering the already
mentioned limitations on assessing this issue, we noticed that ten
initiatives use standard open source software tools. Although this
is not the most relevant factor regarding usability, the use of
standard tools favours users dealing with several open budget
initiatives. Moreover, as these are open source tools, the more
initiatives use them, the better they can be developed.

UI3 - Policy Making: The main requirements in this perspective were
the use of common classifications, vocabularies and ontologies, and
the possibility of linking data with other databases. As already
mentioned, semantics support was mostly absent. Comparison tools,
also important in this case, were found in only three of the
initiatives. Thus, this use perspective is still far from being
realised in most of the analysed initiatives. All this indicates
that working on standard terminologies and common
conceptualizations, as suggested by OpenSpending [?], is highly
desirable.

8. CONCLUSIONS 

In this paper, we presented a model to analyse open budget
initiatives, including dimensions and assessable characterization
attributes. The model covers, at this stage, the General Aspects,
Publishing and Consumption dimensions of these initiatives. Initial
testing of the model and analysis of 23 open budget initiatives
revealed that attention has to be given to feedback, semantics
support and linking possibilities.

Future research includes adding the Context dimensions to the
model, and developing characterization attributes for them. We also
intend to extend the assessment to other initiatives, as well as to
conduct a more detailed analysis of some of their characteristics.
As the results pointed to a very weak performance on the Feedback
dimension, we aim to further explore the Consumption part, in order
to propose solutions that can contribute to this issue. The
Usability dimension also needs more consistent characterization
attributes. The use of standard vocabularies or ontologies and
linkable data formats also deserves attention.

Regarding the use perspectives, we conclude that transparency
requirements were mostly met by the analysed initiatives.
Participation, in turn, is still not heavily supported, while tools
that let policy makers compare budget data are still far from
reality.

It has to be noted that materializing transparency is far more
complex than just publishing budget data through software tools.
Several political issues are related to data publishing, and deep
data literacy questions are involved in the usage of open budget
initiatives. Gurstein [8] warns about the emergence of a "data
divide", a concept parallel to the digital divide, distinguishing
people "who have access to data which could have significance in
their daily lives and those who don’t". Thus, transparency policies
cannot be implemented without actions to foster digital inclusion
and, why not, "data inclusion".

Software tools are, although indispensable, just part of the
process. With this research effort, we aim to enrich the existing
knowledge on open budget, and help to set the basis for developing
tools and procedures that, together with other actions, may result
in real benefits of fiscal transparency to the society.
Table 2: Results of the application of the model on some open budget initiatives. In CA5: (1) local; (2) regional; (3) national; (4) transnational; (5) generic; and (L) legislative
budget. In CA6: (Gov) Government; and (Soc) Society. In CA9: (Tr) transaction; (Ag) aggregate; and (Ge) generic. In CA10, N/A means not applicable, when there is no data
for download. In CA15: (OS) OpenSpending; (CK) CKAN; and other means that specific non open source software was used. This table can be accessed online at http://bit.ly/IFNThhH.















































































